(1) GENERAL INFORMATION:

(i) APPLICANT: Benz, Christopher C. 

Scott, Gary K. 
Chang, Chuan-Hsiung

(ii) TITLE OF INVENTION: A New ETS-Related Gene Overexpressed in
Human Breast and Epithelial Cancers 

(iii) NUMBER OF SEQUENCES: 38 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Townsend and Townsend and Crew LLP

(B) STREET: Two Embarcadero Center, Eighth Floor 

(C) CITY: San Francisco 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP: 94111-3834

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/978,217 

(B) FILING DATE: 25-NOV-1997 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/031,504 

(B) FILING DATE: 27-NOV-1996 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Hunter, Tom 

(B) REGISTRATION NUMBER: 38,498 

(C) REFERENCE/DOCKET NUMBER: 02307E-071110US

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 576-0200 

(B) TELEFAX: (415) 576-0300 




(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1116 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..1116

(D) OTHER INFORMATION: /product= "human ESX"
/note= "epithelial-restricted with serine box (ESX)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATG GCT GCA ACC TGT GAG ATT AGC AAC ATT TTT AGC AAC TAC TTC AGT 48
Met Ala Ala Thr Cys Glu Ile Ser Asn Ile Phe Ser Asn Tyr Phe Ser
1 5 10 15

GCG ATG TAC AGC TCG GAG GAC TCC ACC CTG GCC TCT GTT CCC CCT GCT 96
Ala Met Tyr Ser Ser Glu Asp Ser Thr Leu Ala Ser Val Pro Pro Ala
20 25 30

GCC ACC TTT GGG GCC GAT GAC TTG GTA CTG ACC CTG AGC AAC CCC CAG 144
Ala Thr Phe Gly Ala Asp Asp Leu Val Leu Thr Leu Ser Asn Pro Gln
35 40 45

ATG TCA TTG GAG GGT ACA GAG AAG GCC AGC TGG TTG GGG GAA CAG CCC 192
Met Ser Leu Glu Gly Thr Glu Lys Ala Ser Trp Leu Gly Glu Gln Pro
50 55 60

CAG TTC TGG TCG AAG ACG CAG GTT CTG GAC TGG ATC AGC TAC CAA GTG 240
Gln Phe Trp Ser Lys Thr Gln Val Leu Asp Trp Ile Ser Tyr Gln Val
65 70 75 80

GAG AAG AAC AAG TAC GAC GCA AGC GCC ATT GAC TTC TCA CGA TGT GAC 288
Glu Lys Asn Lys Tyr Asp Ala Ser Ala Ile Asp Phe Ser Arg Cys Asp
85 90 95

ATG GAT GGC GCC ACC CTC TGC AAT TGT GCC CTT GAG GAG CTG CGT CTG 336
Met Asp Gly Ala Thr Leu Cys Asn Cys Ala Leu Glu Glu Leu Arg Leu
100 105 110

GTC TTT GGG CCT CTG GGG GAC CAA CTC CAT GCC CAG CTG CGA GAC CTC 384
Val Phe Gly Pro Leu Gly Asp Gln Leu His Ala Gln Leu Arg Asp Leu
115 120 125

ACT TCC AGC TCT TCT GAT GAG CTC AGT TGG ATC ATT GAG CTG CTG GAG 432
Thr Ser Ser Ser Ser Asp Glu Leu Ser Trp Ile Ile Glu Leu Leu Glu
130 135 140

AAG GAT GGC ATG GCC TTC CAG GAG GCC CTA GAC CCA GGG CCC TTT GAC 480
Lys Asp Gly Met Ala Phe Gln Glu Ala Leu Asp Pro Gly Pro Phe Asp
145 150 155 160

CAG GGC AGC CCC TTT GCC CAG GAG CTG CTG GAC GAC GGT CAG CAA GCC 528
Gln Gly Ser Pro Phe Ala Gln Glu Leu Leu Asp Asp Gly Gln Gln Ala
165 170 175

AGC CCC TAC CAC CCC GGC AGC TGT GGC GCA GGA GCC CCC TCC CCT GGC 576
Ser Pro Tyr His Pro Gly Ser Cys Gly Ala Gly Ala Pro Ser Pro Gly
180 185 190

AGC TCT GAC GTC TCC ACC GCA GGG ACT GGT GCT TCT CGG AGC TCC CAC 624
Ser Ser Asp Val Ser Thr Ala Gly Thr Gly Ala Ser Arg Ser Ser His
195 200 205

TCC TCA GAC TCC GGT GGA AGT GAC GTG GAC CTG GAT CCC ACT GAT GGC 672
Ser Ser Asp Ser Gly Gly Ser Asp Val Asp Leu Asp Pro Thr Asp Gly
210 215 220

AAG CTC TTC CCC AGC GAT GGT TTT CGT GAC TGC AAG AAG GGG GAT CCC 720
Lys Leu Phe Pro Ser Asp Gly Phe Arg Asp Cys Lys Lys Gly Asp Pro
225 230 235 240

AAG CAC GGG AAG CGG AAA CGA GGC CGG CCC CGA AAG CTG AGC AAA GAG 768
Lys His Gly Lys Arg Lys Arg Gly Arg Pro Arg Lys Leu Ser Lys Glu
245 250 255

TAC TGG GAC TGT CTC GAG GGC AAG AAG AGC AAG CAC GCG CCC AGA GGC 816
Tyr Trp Asp Cys Leu Glu Gly Lys Lys Ser Lys His Ala Pro Arg Gly
260 265 270

ACC CAC CTG TGG GAG TTC ATC CGG GAC ATC CTC ATC CAC CCG GAG CTC 864
Thr His Leu Trp Glu Phe Ile Arg Asp Ile Leu Ile His Pro Glu Leu
275 280 285

AAC GAG GGC CTC ATG AAG TGG GAG AAT CGG CAT GAA GGC GTC TTC AAG 912
Asn Glu Gly Leu Met Lys Trp Glu Asn Arg His Glu Gly Val Phe Lys
290 295 300

TTC CTG CGC TCC GAG GCT GTG GCC CAA CTA TGG GGC CAA AAG AAA AAG 960
Phe Leu Arg Ser Glu Ala Val Ala Gln Leu Trp Gly Gln Lys Lys Lys
305 310 315 320

AAC AGC AAC ATG ACC TAC GAG AAG CTG AGC CGG GCC ATG AGG TAC TAC 1008
Asn Ser Asn Met Thr Tyr Glu Lys Leu Ser Arg Ala Met Arg Tyr Tyr
325 330 335

TAC AAA CGG GAG ATC CTG GAA CGG GTG GAT GGC CGG CGA CTC GTC TAC 1056
Tyr Lys Arg Glu Ile Leu Glu Arg Val Asp Gly Arg Arg Leu Val Tyr
340 345 350

AAG TTT GGC AAA AAC TCA AGC GGC TGG AAG GAG GAA GAG GTT CTC CAG 1104
Lys Phe Gly Lys Asn Ser Ser Gly Trp Lys Glu Glu Glu Val Leu Gln
355 360 365

AGT CGG AAC TGA 1116
Ser Arg Asn
370
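
As a cross-check on the codon and residue pairing shown for SEQ ID NO: 1, the following Python sketch translates the first and last rows of the listing with the standard genetic code. The function name `translate` and the way the codon table is built are illustrative assumptions and are not part of the filing.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated
# with bases in the order T, C, A, G (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence to one-letter residues, stopping at the first stop codon."""
    seq = "".join(cds.split()).upper()  # drop the spaces between codons
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First and last rows of SEQ ID NO: 1 as printed in the listing.
first_row = "ATG GCT GCA ACC TGT GAG ATT AGC AAC ATT TTT AGC AAC TAC TTC AGT"
last_row = "AGT CGG AAC TGA"

print(translate(first_row))  # MAATCEISNIFSNYFS
print(translate(last_row))   # SRN
```

The one-letter output corresponds to Met Ala Ala Thr Cys Glu Ile Ser Asn Ile Phe Ser Asn Tyr Phe Ser for the first row and Ser Arg Asn for the last, with the final TGA read as the stop codon.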

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Ala Thr Cys Glu Ile Ser Asn Ile Phe Ser Asn Tyr Phe Ser
1 5 10 15

Ala Met Tyr Ser Ser Glu Asp Ser Thr Leu Ala Ser Val Pro Pro Ala
20 25 30

Ala Thr Phe Gly Ala Asp Asp Leu Val Leu Thr Leu Ser Asn Pro Gln
35 40 45

Met Ser Leu Glu Gly Thr Glu Lys Ala Ser Trp Leu Gly Glu Gln Pro
50 55 60

Gln Phe Trp Ser Lys Thr Gln Val Leu Asp Trp Ile Ser Tyr Gln Val
65 70 75 80

Glu Lys Asn Lys Tyr Asp Ala Ser Ala Ile Asp Phe Ser Arg Cys Asp
85 90 95

Met Asp Gly Ala Thr Leu Cys Asn Cys Ala Leu Glu Glu Leu Arg Leu
100 105 110

Val Phe Gly Pro Leu Gly Asp Gln Leu His Ala Gln Leu Arg Asp Leu
115 120 125

Thr Ser Ser Ser Ser Asp Glu Leu Ser Trp Ile Ile Glu Leu Leu Glu
130 135 140

Lys Asp Gly Met Ala Phe Gln Glu Ala Leu Asp Pro Gly Pro Phe Asp
145 150 155 160

Gln Gly Ser Pro Phe Ala Gln Glu Leu Leu Asp Asp Gly Gln Gln Ala
165 170 175

Ser Pro Tyr His Pro Gly Ser Cys Gly Ala Gly Ala Pro Ser Pro Gly
180 185 190

Ser Ser Asp Val Ser Thr Ala Gly Thr Gly Ala Ser Arg Ser Ser His
195 200 205

Ser Ser Asp Ser Gly Gly Ser Asp Val Asp Leu Asp Pro Thr Asp Gly
210 215 220

Lys Leu Phe Pro Ser Asp Gly Phe Arg Asp Cys Lys Lys Gly Asp Pro
225 230 235 240

Lys His Gly Lys Arg Lys Arg Gly Arg Pro Arg Lys Leu Ser Lys Glu
245 250 255

Tyr Trp Asp Cys Leu Glu Gly Lys Lys Ser Lys His Ala Pro Arg Gly
260 265 270

Thr His Leu Trp Glu Phe Ile Arg Asp Ile Leu Ile His Pro Glu Leu
275 280 285

Asn Glu Gly Leu Met Lys Trp Glu Asn Arg His Glu Gly Val Phe Lys
290 295 300

Phe Leu Arg Ser Glu Ala Val Ala Gln Leu Trp Gly Gln Lys Lys Lys
305 310 315 320

Asn Ser Asn Met Thr Tyr Glu Lys Leu Ser Arg Ala Met Arg Tyr Tyr
325 330 335

Tyr Lys Arg Glu Ile Leu Glu Arg Val Asp Gly Arg Arg Leu Val Tyr
340 345 350

Lys Phe Gly Lys Asn Ser Ser Gly Trp Lys Glu Glu Glu Val Leu Gln
355 360 365

Ser Arg Asn
370

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1907 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 96..1211

(D) OTHER INFORMATION: /product= "human ESX"
/note= "epithelial-restricted with serine box (ESX)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CGGCCAGATA CCTCAGCGCT ACCTGGCGGA ACTGGATTTC TCTCCCGCCT GCCGGCCTGC 60

CTGCCACAGC CGGACTCCGC CACTCCGGTA GCCTCATGGC TGCAACCTGT GAGATTAGCA 120

ACATTTTTAG CAACTACTTC AGTGCGATGT ACAGCTCGGA GGACTCCACC CTGGCCTCTG 180

TTCCCCCTGC TGCCACCTTT GGGGCCGATG ACTTGGTACT GACCCTGAGC AACCCCCAGA 240

TGTCATTGGA GGGTACAGAG AAGGCCAGCT GGTTGGGGGA ACAGCCCCAG TTCTGGTCGA 300

AGACGCAGGT TCTGGACTGG ATCAGCTACC AAGTGGAGAA GAACAAGTAC GACGCAAGCG 360

CCATTGACTT CTCACGATGT GACATGGATG GCGCCACCCT CTGCAATTGT GCCCTTGAGG 420

AGCTGCGTCT GGTCTTTGGG CCTCTGGGGG ACCAACTCCA TGCCCAGCTG CGAGACCTCA 480

CTTCCAGCTC TTCTGATGAG CTCAGTTGGA TCATTGAGCT GCTGGAGAAG GATGGCATGG 540

CCTTCCAGGA GGCCCTAGAC CCAGGGCCCT TTGACCAGGG CAGCCCCTTT GCCCAGGAGC 600

TGCTGGACGA CGGTCAGCAA GCCAGCCCCT ACCACCCCGG CAGCTGTGGC GCAGGAGCCC 660

CCTCCCCTGG CAGCTCTGAC GTCTCCACCG CAGGGACTGG TGCTTCTCGG AGCTCCCACT 720

CCTCAGACTC CGGTGGAAGT GACGTGGACC TGGATCCCAC TGATGGCAAG CTCTTCCCCA 780

GCGATGGTTT TCGTGACTGC AAGAAGGGGG ATCCCAAGCA CGGGAAGCGG AAACGAGGCC 840

GGCCCCGAAA GCTGAGCAAA GAGTACTGGG ACTGTCTCGA GGGCAAGAAG AGCAAGCACG 900

CGCCCAGAGG CACCCACCTG TGGGAGTTCA TCCGGGACAT CCTCATCCAC CCGGAGCTCA 960

ACGAGGGCCT CATGAAGTGG GAGAATCGGC ATGAAGGCGT CTTCAAGTTC CTGCGCTCCG 1020

AGGCTGTGGC CCAACTATGG GGCCAAAAGA AAAAGAACAG CAACATGACC TACGAGAAGC 1080

TGAGCCGGGC CATGAGGTAC TACTACAAAC GGGAGATCCT GGAACGGGTG GATGGCCGGC 1140

GACTCGTCTA CAAGTTTGGC AAAAACTCAA GCGGCTGGAA GGAGGAAGAG GTTCTCCAGA 1200

GTCGGAACTG AGGGTTGGAA CTATACCCGG GACCAAACTC ACGGACCACT CGAGGCCTGC 1260

AAACCTTCCT GGGAGGACAG GCAGGCCAGA TGGCCCCTCC ACTGGGGAAT GCTCCCAGCT 1320

GTGCTGTGGA GAGAAGCTGA TGTTTTGGTG TATTGTCAGC CATCGTCCTT GGACTCGGAG 1380

ACTATGGCCT CGCCTCCCCA CCCTCCTCTT GGAATTACAA GCCCTGGGGT TTGAAGCTGA 1440

CTTTATAGCT GCAAGTGTAT CTCCTTTTAT CTGGTGCCTC CTCAAACCCA GTCTCAGACA 1500

CTTAAATGCA GACAACACCT TCTTCCTGCA GACACTTGGA CTGAGCCAAG GAGGCTTGGG 1560

AGGCCCTAGG GAGCACCGTG ATGGAGAGGA CAGAGCAGGG GCTCCAGCAC TTCTTTCTGG 1620

ACTGGCGTTC ACCTCCCTGC TCAGTGCTTG GGCTCCACGG GCAGGGGTCA GAGCACTCCC 1680

TAATTTATGT GCTATATAAA TATGTCAGAT GTACATAGAG ATCTATTTTT TCTAAAACAT 1740

TCCCCTCCCC ACTCCTCTCC CACAGAGTGC TGGACTGTTC CAGGCCCTCC AGTGGGCTGA 1800

TGCTGGGACC CTTAGGATGG GGCTCCCAGC TCCTTTCTCC TGTGAATGGA GGCAGAGACC 1860

TCCAATAAAG TGCCTTCTGG GCTTTTTCTA AAAAAAAAAA AAAAAAA 1907

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 189 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..189

(D) OTHER INFORMATION: /note= "first variable region
(nucleotides 1-189 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATGGCTGCAA CCTGTGAGAT TAGCAACATT TTTAGCAACT ACTTCAGTGC GATGTACAGC 60

TCGGAGGACT CCACCCTGGC CTCTGTTCCC CCTGCTGCCA CCTTTGGGGC CGATGACTTG 120

GTACTGACCC TGAGCAACCC CCAGATGTCA TTGGAGGGTA CAGAGAAGGC CAGCTGGTTG 180

GGGGAACAG 189
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 120 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..120

(D) OTHER INFORMATION: /note= "pointed region (nucleotides
190-309 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCCCAGTTCT GGTCGAAGAC GCAGGTTCTG GACTGGATCA GCTACCAAGT GGAGAAGAAC 60

AAGTACGACG CAAGCGCCAT TGACTTCTCA CGATGTGACA TGGATGGCGC CACCCTCTGC 120

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 252 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..252

(D) OTHER INFORMATION: /note= "second variable region
(nucleotides 310-561 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AATTGTGCCC TTGAGGAGCT GCGTCTGGTC TTTGGGCCTC TGGGGGACCA ACTCCATGCC 60

CAGCTGCGAG ACCTCACTTC CAGCTCTTCT GATGAGCTCA GTTGGATCAT TGAGCTGCTG 120

GAGAAGGATG GCATGGCCTT CCAGGAGGCC CTAGACCCAG GGCCCTTTGA CCAGGGCAGC 180

CCCTTTGCCC AGGAGCTGCT GGACGACGGT CAGCAAGCCA GCCCCTACCA CCCCGGCAGC 240

TGTGGCGCAG GA 252

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 84 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..84

(D) OTHER INFORMATION: /note= "second variable region
(amino acids 104-187 of SEQ ID NO:2)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Asn Cys Ala Leu Glu Glu Leu Arg Leu Val Phe Gly Pro Leu Gly Asp
1 5 10 15

Gln Leu His Ala Gln Leu Arg Asp Leu Thr Ser Ser Ser Ser Asp Glu
20 25 30

Leu Ser Trp Ile Ile Glu Leu Leu Glu Lys Asp Gly Met Ala Phe Gln
35 40 45

Glu Ala Leu Asp Pro Gly Pro Phe Asp Gln Gly Ser Pro Phe Ala Gln
50 55 60

Glu Leu Leu Asp Asp Gly Gln Gln Ala Ser Pro Tyr His Pro Gly Ser
65 70 75 80

Cys Gly Ala Gly

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 153 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..153

(D) OTHER INFORMATION: /note= "serine-rich region
(nucleotides 562-714 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GCCCCCTCCC CTGGCAGCTC TGACGTCTCC ACCGCAGGGA CTGGTGCTTC TCGGAGCTCC 60

CACTCCTCAG ACTCCGGTGG AAGTGACGTG GACCTGGATC CCACTGATGG CAAGCTCTTC 120

CCCAGCGATG GTTTTCGTGA CTGCAAGAAG GGG 153

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..105

(D) OTHER INFORMATION: /note= "third variable region
(nucleotides 715-819 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GATCCCAAGC ACGGGAAGCG GAAACGAGGC CGGCCCCGAA AGCTGAGCAA AGAGTACTGG 60

GACTGTCTCG AGGGCAAGAA GAGCAAGCAC GCGCCCAGAG GCACC 105

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 243 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..243

(D) OTHER INFORMATION: /note= "Ets DNA binding domain
(nucleotides 820-1062 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

CACCTGTGGG AGTTCATCCG GGACATCCTC ATCCACCCGG AGCTCAACGA GGGCCTCATG 60

AAGTGGGAGA ATCGGCATGA AGGCGTCTTC AAGTTCCTGC GCTCCGAGGC TGTGGCCCAA 120

CTATGGGGCC AAAAGAAAAA GAACAGCAAC ATGACCTACG AGAAGCTGAG CCGGGCCATG 180

AGGTACTACT ACAAACGGGA GATCCTGGAA CGGGTGGATG GCCGGCGACT CGTCTACAAG 240

TTT 243

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..51

(D) OTHER INFORMATION: /note= "fourth variable region
(nucleotides 1063-1113 of SEQ ID NO:1)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
GGCAAAAACT CAAGCGGCTG GAAGGAGGAA GAGGTTCTCC AGAGTCGGAA C 51 
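
The feature notes for SEQ ID NOs: 4-6 and 8-11 describe consecutive nucleotide ranges of SEQ ID NO: 1. The following Python sketch checks that those ranges are contiguous and that each range matches the LENGTH stated for its sequence. The names `REGIONS`, `to_slice`, and `check_regions` are hypothetical helpers introduced only for this illustration.

```python
# Region boundaries as stated in the feature notes (1-based, inclusive),
# paired with the LENGTH given for the corresponding SEQ ID NO.
REGIONS = [
    ("first variable region (SEQ ID NO: 4)", 1, 189, 189),
    ("pointed region (SEQ ID NO: 5)", 190, 309, 120),
    ("second variable region (SEQ ID NO: 6)", 310, 561, 252),
    ("serine-rich region (SEQ ID NO: 8)", 562, 714, 153),
    ("third variable region (SEQ ID NO: 9)", 715, 819, 105),
    ("Ets DNA binding domain (SEQ ID NO: 10)", 820, 1062, 243),
    ("fourth variable region (SEQ ID NO: 11)", 1063, 1113, 51),
]


def to_slice(start: int, end: int) -> slice:
    """Convert a 1-based inclusive range to a 0-based, end-exclusive Python slice."""
    return slice(start - 1, end)


def check_regions(regions) -> int:
    """Verify the regions tile the sequence without gaps; return the last covered position."""
    expected_start = 1
    for name, start, end, length in regions:
        if start != expected_start:
            raise ValueError(f"gap or overlap before {name}")
        if end - start + 1 != length:
            raise ValueError(f"stated length does not match range for {name}")
        expected_start = end + 1
    return expected_start - 1


covered = check_regions(REGIONS)
print(covered)                           # 1113
print("ATGGCTGCAA"[to_slice(1, 3)])      # ATG
```

Under these assumptions the seven regions cover nucleotides 1-1113, which leaves only the three-nucleotide stop codon at positions 1114-1116 of the 1116 base pair coding sequence.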

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..16

(D) OTHER INFORMATION: /note= "C-terminal 16 amino acids
(amino acids 356-371 of SEQ ID NO:2)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Lys Asn Ser Ser Gly Trp Lys Glu Glu Glu Val Leu Gln Ser Arg Asn
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..21

(D) OTHER INFORMATION: /note= "5' ESX-DBD primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CCGGGACATC CTCATCCACC C 21

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..21

(D) OTHER INFORMATION: /note= "3' ESX-DBD primer"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GTACCTCATG GCCCGGCTCA G 21 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7752 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(3604..3763, 4152..4373, 4504..4599, 4788..4907,
5055..5144, 5287..5403, 6257..6452, 7001..7112)

(D) OTHER INFORMATION: /product= "mouse ESX"
/note= "mouse epithelial-restricted with serine box (ESX)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GGATCCTTCC AAGGCACTGA CCTCACCCAA TTCTTTCTCA CTTTTCTCCT CCATTTAACT 60

GTGGACGGAA TCAATACTCA GGGGGATGCG CTAGCTCTAA GATTTCTGCA GCTTTGCCTC 120

TCCTGAGCGG AAGCCCCGTG AAGGCAAGGG AGCTAGCTGA TGGACTCTTT GTGGTCTTCT 180

TCCTCTTTGC TCTGGAGACC CAACCAGGTG TTCTTAGGGG AAGGAGCACG GTGAGTAGC C A 240

AGAGGCTAAA AGCTGGTTCT CCCACATTCC AGGGTAAGTG ACTGGGTAGA GGGTGTGTCT 300

GCCTCAGGCT GCTTGGAGGA GGTCCCCTGA AGGGCCATGA GAAAATCCTA CCCAGAGCCC 360

TTGGTTTTCC AGCAGCCCTC CACCTAGAGG AAAGGAGCCT GTCGTTCTGA AGATGAAGAG 420

TGGAGCCTAT GGGGGTGGGC AGATTGTGTC CTGGGACAAT GGGGTACCTA GAAGAGAAAG 480

GAATCTCCTT TCGTTTGAGG TCTACCTGGG GGTCGTGTGT CTGTAAATGG GGTGGAGAGA 540

GGAGAAGACA CAGATCTTAT AACGTAGATG CAGGAAATGC TGACAGTTCA GTGTAGAGAA 600

CTTACTCAAT TCATATAGCC TCCAAAGCTA TCTCCTCAGG CAACGCAAAA CAAACCAGTT 660

GGAGCCGCAA GACATCTAAT GGCTTATCGA GTCCCACACC CTCGATTCTT TGCTAATTTT 720

ATGGTTTTGC TTTTGAGACA ATCTACTGTA GCCTAAGATA GCCCCAAACT CAAATGTAGC 780

TGAGGCTGAC TGACCCTGAG CTCTGGAATT CCAGACACAT GCATATCTTT TGCTAGGCAA 840

TAATCGCTCT ACCAGCTGTA CTCCCACATT CCAGGGTAAG TGACTGGAAT TCTCACTTAC 900

TATATCCCTT TAAAAATTCC CTGAGTGGGA TGGTTGTAGC CAGAGGGAAA AGGCACCAAC 960

AACTGCTTGT CACTTTCCAA ATTTGGTAGC CTGAACAAAC CACTTATCAA GACAACAACT 1020

ATATATCATT TCTTTTCTTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC TCTCTCTCTC 1080

TCTCTCTCTC TCTCTCTTTN GAAAGAGTCT CACTACTATG TAGCCCTTGA TAACCTAGAA 1140

CTCACTATGT AGTCCAGGCT TGGCCTTCAG CTCGCAGAGG TCCACTTGCC TTGGGAGTTG 1200

AGAGATTAAA GGGATGCATC TCCACATGTG TCCAACAGTG CTTTTTAAAA ATATTTTTAA 1260

AACCATGCTT ACAGCCAGGC ATAGTGGGCG TGCCTTTAAT CCCAGTACTG GGGAGGCAGA 1320

GGTAGGTAGA GTTCTGAGTT GGAGGCTAGC CACATAGTAA GTCCCAGGAT AGCTAGAACT 1380

ATGTAAAGAC CATGTCTCAA AAAAGATGCA CACACACATA TACACACACA CGTTTGTATG 1440

TGTTTGTTTA GTGTGTATGT GTGTGTACAC TTGCACATAA AGGTCAGAGT ACCACATTAC 1500

AGGAGTCAGT TTTCTCCTTT TATCATGTAT GGATGGAACA CGGGTCCATC CATAGCATCC 1560

TTAGCAGCAG GTATCCTTAT CCACTGAGCT ATCTCAGCAG CCCCACATTG CTTATTGGAT 1620

GTTTTTGGAT GAGGATAGTT ATATTAAAAA GGTTTCTGGT GTTGGTCTGG GTAGTTACCC 1680

TTTAACCCAT CTCTAGAGCC TGTCTCTTGA GTTTGAGGCC AGCCTGGTAT ATGTAGCTAG 1740

ACAAAGTTTC AAAAATGAAC AGAATCCTGG GACTAGAACC CATTTGTAGA ATGCTTGCAT 1800

AAGAAGCTCT GGGTTCAACT TCCTGCATCT CCAGAGGGAT TTTGTTCTGT AGTTTTAGTT 1860

TTTCAAGACA GAGTTTCTCT GTGTAGCCCT GGCTGTCCTG GAACTCACTC TGTAGACAAG 1920

GCTGGCCTCG AACTCAGAAA TCCTTCTACC TCTACTTCAG GAGTGCTGGG ATTAAAGATG 1980

TGCGCTGCCC TCCTCCACCC CAATTTGTTT TTGTTTTTTA AGGGCCCCGG TAAACAGTAA 2040

ATTAACATGT GCATCCTGTT TGTCTTTGTA ATGACTCAAA TGTTGGGCTT CTGACCACTA 2100

GAGGGCAGCA GGCAGATACT AATGGACTGG GCGGAGAGAA GGGTAATCAG GAGCAGACCA 2160

GACTCGCGGA TAAACCAAAC AGCACCGCCA GCCGACCCTA GGCGAGGAGA GCGCCACAGG 2220

CACCAAGGGA AGACTTGAAG TAGTGTCTGA TCTCTACCGC TTCAGCAACC ATCGCGTTTG 2280

GGTGGGCTCC AGACAGGCAA AGTGCCAGCA AATGGTCCCT GTAGCTGACT AAACAGACTA 2340

TCAGACCCAA ACCACCACTG GACCGTGAAT GTTGCCCAGT GTGTTGCCTA GCCGCTTTCA 2400

GAATCCCAGC TTCTGGGTGT TGTGGAGGAA ACCCCTTAGC CTCGGTAACT TTCACCAGGC 2460

CCTTCTTGTC TCTAGACATC TAGACAGTTG GAAGCATCAG TCTTGACCCA GCCACCGGTT 2520

CAGATTCTTT GCCTTGCTTT TTCTTCCCCA GTTCAGCCCT GGCCAGGCCC CCAGGAAGAA 2580
GGGGGGTGTT GAAGAGAGAA 3480

TGTCTACAGC AACACTGAAC TTCCTGCCTC TCGGCTGTTG CTGCCCAGGC TTTGCCAGAC 3540

AGAAATGGAA GTGTATCCTG ACCTGTACCC TCCCCACCTT GTCTCCTCTT CCCAGGGGCC 3600

CTC ATG GCT GCC ACC TGT GAG ATC AGC AAC GTT TTT AGT AAC TAC TTC 3648
Met Ala Ala Thr Cys Glu Ile Ser Asn Val Phe Ser Asn Tyr Phe
1 5 10 15

AAC GCC ATG TAC AGC TCA GAA GAC CCC ACC CTG GCT CCT GCT CCT CCG 3696
Asn Ala Met Tyr Ser Ser Glu Asp Pro Thr Leu Ala Pro Ala Pro Pro
20 25 30

ACT ACC TTT GGC ACT GAA GAC TTG GTG TTG ACC CTG AAC AAC CAA CAG 3744
Thr Thr Phe Gly Thr Glu Asp Leu Val Leu Thr Leu Asn Asn Gln Gln
35 40 45

ATG ACA CTG GAA GGT CCA G GTGAGTGCTG TGTAAAATCT TTTCAGACAG 3793
Met Thr Leu Glu Gly Pro
50

GACACCAATG ATCTGAGAGG CTCTTAGATG ATAAATGGAC AGGGAGGAAG GGTATCCTGG 3853

AGTTAGTGGC TGGGGAGGAT TTATTCATTC ATATGTTTGT GTAGTACTGG GGAAAGAACC 3913

CAAACAAGAC CTTATTTATG CTAGACTGTG TTCCTAGTCC CGAGAAGACT GTACTGGCTG 3973

AGGTGGTGGG AATATAAGAA CTGTGGTGAC AGATTAAGGG AGGATGAACT TGAGAACTAG 4033

CCATGTTGTG ATTGTGGATA TGTATCTGTC CCTCTCCGCC CCTCCTCGGG TTGTGTAGGA 4093

CCTCAGACAA GATCCCAAAG GGACAGGACT GATCCTCTGG CTGTACTCCA CCTTGCAG AG 4153
Glu

AAG GCA AGC TGG ACT AGC GAG CGG CCC CAG TTC TGG TCG AAG ACC CAG 4201
Lys Ala Ser Trp Thr Ser Glu Arg Pro Gln Phe Trp Ser Lys Thr Gln
55 60 65 70

GTT CTG GAG TGG ATC AGC TAC CAA GTG GAG AAG AAC AAG TAT GAC GCC 4249
Val Leu Glu Trp Ile Ser Tyr Gln Val Glu Lys Asn Lys Tyr Asp Ala
75 80 85

AGC TCC ATC GAC TTC TCC CGC TGC AAC ATG GAC GGA GCC ACC CTC TGC 4297
Ser Ser Ile Asp Phe Ser Arg Cys Asn Met Asp Gly Ala Thr Leu Cys
90 95 100

AGC TGT GCG CTG GAG GAG CTG CGG CTA GTC TTT GGA CCT CTG GGA GAC 4345
Ser Cys Ala Leu Glu Glu Leu Arg Leu Val Phe Gly Pro Leu Gly Asp
105 110 115

CAG CTC CAT GCC CAG CTT CGG GAC CTC A GTAAGTCTAG GCTGGGAGCC 4393
Gln Leu His Ala Gln Leu Arg Asp Leu
120 125

ACAGGGCCTA AAGAGTGAGC GAGGTGGCTG GGACTTGGGC AGGAGGGTGC AGCCATCGAG 4453

CCCCTGCCGG AACCATGGTC GGTGACGCTC TCCCTCCCTG CCTCCGCCAG CC TCC 4508
Thr Ser

AAC TCT TCT GAT GAA CTC AGC TGG ATC ATC GAG CTG CTG GAG AAG GAT 4556
Asn Ser Ser Asp Glu Leu Ser Trp Ile Ile Glu Leu Leu Glu Lys Asp
130 135 140 145

GGC ATG TCC TTC CAA GAG AGC CTA GGC GAC TTG GGC CCC TTT G 4599
Gly Met Ser Phe Gln Glu Ser Leu Gly Asp Leu Gly Pro Phe
150 155

GTGAGAACCC ATTTTCTCCC TTTTTCCTCC CTAGCTTGTC TTGTCCCATC TGTAACTCCT 4659

CCAGAGTGCT ACAGATATTC TCTCCCAACT TGAAAATAAG TCCATAGTCA TTTCTGTGGT 4719

CCCTGGAGGG TCGTGCCTGT CCTTGCTGGT ATCCTGGGCC TCTCTAAGCT CTTAACTTCT 4779

TTTCTCAG AT CAG GGA AGT CCT TTT GCC CAG GAA CTC CTG GAT GAT GGC 4828
Asp Gln Gly Ser Pro Phe Ala Gln Glu Leu Leu Asp Asp Gly
160 165 170

CGC CAG GCC AGT CCC TAC TAC TGC AGT ACC TAT GGC CCT GGA GCG CCC 4876
Arg Gln Ala Ser Pro Tyr Tyr Cys Ser Thr Tyr Gly Pro Gly Ala Pro
175 180 185

TCC CCC GGC AGC TCT GAT GTC TCC ACT GCA A GTAAGTCCTG CCCTTGCCAC 4927
Ser Pro Gly Ser Ser Asp Val Ser Thr Ala
190 195

AGCCTGCCTT CTCCAAGTGC CCTAGAGTGC ATCGAGTTCT TACAATACTC ATTCAGTATC 4987

TGAAGTCTGG GTACGCAGTG ACTGGGTAGG CTGGCCCTGG CATTCAAGTG GTATTCTTCA 5047

CCCCTAG GG ACC GCT ACT CCC CAG AGT TCC CAT GCC TCT GAC TCC GGT 5095
Arg Thr Ala Thr Pro Gln Ser Ser His Ala Ser Asp Ser Gly
200 205 210

GGA AGT GAT GTG GAC CTG GAC CTC ACC GAG AGC AAG GTC TTC CCT AGA G 5144
Gly Ser Asp Val Asp Leu Asp Leu Thr Glu Ser Lys Val Phe Pro Arg
215 220 225

GTGAGTTGAG GGCTGTTCTT GGGGGTCCTG TCCATGGGGT CTAGCCACTC CCCTCTGCCC 5204

TATGGCTGCA GTTTCTGTAC CAAGGCTCCC TGTTGACACC CTGCCCTTAC CTTCTCTTGA 5264

CCTTCCAACC CCCTTCCCAT AG AT GAC TTT ACT GAC TAT AAG AAG GGG GAA 5315
Asp Asp Phe Thr Asp Tyr Lys Lys Gly Glu
230 235

CCC AAG CAC GGG AAG AGG AAA CGG GGG CGT CCC AGA AAG CTG AGC AAG 5363
Pro Lys His Gly Lys Arg Lys Arg Gly Arg Pro Arg Lys Leu Ser Lys
240 245 250 255

GAA TAC TGG GAC TGT CTG GAG GGC AAG AAG AGC AAG CAC G 5403
Glu Tyr Trp Asp Cys Leu Glu Gly Lys Lys Ser Lys His
CGAGGGGACT 5883

GTTAATTCCG GGAAGCTGTT TCTTGGTCCC TCAGGCTATA GGCAGCTCTC TGACCCCATG 5943

TGTGCCAAGT TCTCACCACC ACTGGTCCCC ACTGAACCAT GAGCCCCCTC ACAAAGAAGC 6003

GTGTCTCTGT CGCTGTCCAT CTTAACCAGT TGTTTGATCC TTAACTGGTG AGAGAATCGA 6063

GCGCTCTGTG CAGTCGGCCT AGCGCATTGC ATTTTGGGGC AGGAAAGGAA GCAGCCACTA 6123

TAGCAATCAC TAAGAGGACA TTTCATATAC TCCCATATGC CTTGGCTCTT AGCCTCGTTG 6183

GGATAGGAGA GGCCAGGTCG CCTAGAGGAG AGGGGCACCC CAGACTGATA ACTGAGGAAA 6243

TCTTCCCTTG TAG CC CCC AGA GGT ACT CAC CTG TGG GAG TTT ATC CGA 6291
Ala Pro Arg Gly Thr His Leu Trp Glu Phe Ile Arg
270 275 280

GAC ATC CTA ATC CAC CCC GAG CTC AAC GAA GGC CTC ATG AAG TGG GAG 6339
Asp Ile Leu Ile His Pro Glu Leu Asn Glu Gly Leu Met Lys Trp Glu
285 290 295

AAC CGG CAC GAG GGT GTG TTC AAG TTT CTT CGC TCA GAG GCC GTG GCC 6387
Asn Arg His Glu Gly Val Phe Lys Phe Leu Arg Ser Glu Ala Val Ala
300 305 310

CAA CTC TGG GGC CAG AAG AAG AAG AAC AGC AAC ATG ACC TAT GAG AAG 6435
Gln Leu Trp Gly Gln Lys Lys Lys Asn Ser Asn Met Thr Tyr Glu Lys
315 320 325

CTG AGC CGA GCC ATG AG GTGAGTGTGA GCGTCAGGGA CCTCTGCTTG 6482
Leu Ser Arg Ala Met Arg
330

GGCTCTACTG GCTTCCGCTA GGTTTCACGA GACAGGCCTG AGGCCCGTAT GGAGAGGACA 6542

AGGACAGTGT TGTGGCCCTG TGTAGTTGGT TACGTGCAGC ATGAAGAAAG CGCTGGGCAG 6602

AGATCGTGAG CACACTTAGC TTTAGCTAAC ATTTCTGTGT TTCCTGCAGA CTTGTTCTAA 6662

GAAAGACACT TGAGAGAGAG AAAGAATAGA AATTGACAGC TCAGCTCCCT TGTCTCTGGG 6722

CCACAAAGGT GAACTAGCTC AGCATTGCTA AAGTCCCCTC TCCCTCAGTT CACGGGCCTT 6782

TATGAAAAGC CCCAGGACAT AGCCAGAAGG CACAGAGAAG TAAATGTAGA AGCAGGTGCT 6842

CTGGCCATAA TTACAGATCA CCGCGGCCAC AACAGGTGAG GAGAGGGAAC ACTCAGGCAG 6902

AGAGGGCCAG CTCAGCACAC TGGGGCTGGG AACCAATGCG AACCTCAGTC CATAGCATGC 6962

CTCTTGCCTA CACCTCTGAC CACCTCCTTC CCACGCAG G TAT TAC TAC AAA CGG 7016
Tyr Tyr Tyr Lys Arg
335

GAG ATC CTG GAA CGG GTG GAT GGC CGA CGG CTC GTC TAC AAG TTT GGC 7064
Glu Ile Leu Glu Arg Val Asp Gly Arg Arg Leu Val Tyr Lys Phe Gly
340 345 350 355

AAG AAC TCT AGT GGC TGG AAG GAA GAA GAG GTT GGA GAG AGT CGG AAT 7112
Lys Asn Ser Ser Gly Trp Lys Glu Glu Glu Val Gly Glu Ser Arg Asn
360 365 370

TAAGGATCGG GGCTGGACCC AGGACCTGAC TCAGGCATGA ACTCCAGAAC TGAAGCCTTC 7172

CTGGAAGGAC AGGCAGGCCT GACGGCCCCC TTAACATGGA TGTGTTCCCT GTGTTGCTGT 7232

AGAGAGGAAG AACCTGTTGG GCGTGCCCTC TGCAGTCTCC TCAAGTGCAG CCTTTGGCCT 7292

CTCTCCTCGC CCTCTTGGAA TTACAAGCCC CGGGTTTGAA CCAACTTGTT CGATAACTCT 7352

TCCAGCTGTG ATTCCAGTTC CCTCCCGTCC CAACATGGAC TGCAAATGAG ACCCACCTGC 7412

AGATGCCTGG CCTCAGCCAA GGAGGCTGGG GAGACTGTGG CAGGAGACTG CAGGGACGGA 7472

GGGGACAGGG TTGTGTCCTC GGTACTTCCT GGACTGCCTT CCACCTCTTT GCTCAGTACT 7532

CAGGCTCCAC AGACGGGGGT CGGATCATCC CTAATTTATG TGCTATAAAT ATTCCAGGTG 7592

TATATAGAGA GCTATTTTTT CTAAAGCATT TCCCCTCCCT GCTCTTCTCC ACTGAGTGCT 7652

GGTGGCCAGA CTGATTTTTT TTTTAGCCCC CCTAACTGGA CCAGCGAGAA GTAGGGTGAT 7712

TCCAGGACCC CCTCTTCCCC CAGAGGGGTC TCCTGGATCC 7752
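
The CDS feature of SEQ ID NO: 15 is given as a `join(...)` of eight genomic segments. The following Python sketch, with OCR spacing in the location string normalized, parses that string and sums the segment lengths to compare them with the 371 amino acid length stated for SEQ ID NO: 16. The helper names `parse_join` and `exon_lengths` are hypothetical and are used only for this illustration.

```python
import re

# CDS location of SEQ ID NO: 15 as given in the feature table
# (1-based, inclusive coordinates; spacing normalized).
LOCATION = (
    "join(3604..3763, 4152..4373, 4504..4599, 4788..4907, "
    "5055..5144, 5287..5403, 6257..6452, 7001..7112)"
)


def parse_join(location: str) -> list[tuple[int, int]]:
    """Extract (start, end) pairs from a join(...) location string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def exon_lengths(segments: list[tuple[int, int]]) -> list[int]:
    """Length of each segment, counting both endpoints."""
    return [end - start + 1 for start, end in segments]


segments = parse_join(LOCATION)
lengths = exon_lengths(segments)
total = sum(lengths)

print(lengths)            # [160, 222, 96, 120, 90, 117, 196, 112]
print(total, total // 3)  # 1113 371
```

The joined segments total 1113 nucleotides, that is 371 codons, which agrees with the stated length of SEQ ID NO: 16. The last segment ends on the final Asn codon, so the stop codon is not part of the joined range.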



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 371 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Ala Ala Thr Cys Glu Ile Ser Asn Val Phe Ser Asn Tyr Phe Asn
1 5 10 15

Ala Met Tyr Ser Ser Glu Asp Pro Thr Leu Ala Pro Ala Pro Pro Thr
20 25 30

Thr Phe Gly Thr Glu Asp Leu Val Leu Thr Leu Asn Asn Gln Gln Met
35 40 45

Thr Leu Glu Gly Pro Glu Lys Ala Ser Trp Thr Ser Glu Arg Pro Gln
50 55 60

Phe Trp Ser Lys Thr Gln Val Leu Glu Trp Ile Ser Tyr Gln Val Glu
65 70 75 80

Lys Asn Lys Tyr Asp Ala Ser Ser Ile Asp Phe Ser Arg Cys Asn Met
85 90 95

Asp Gly Ala Thr Leu Cys Ser Cys Ala Leu Glu Glu Leu Arg Leu Val
100 105 110

Phe Gly Pro Leu Gly Asp Gln Leu His Ala Gln Leu Arg Asp Leu Thr
115 120 125

Ser Asn Ser Ser Asp Glu Leu Ser Trp Ile Ile Glu Leu Leu Glu Lys
130 135 140

Asp Gly Met Ser Phe Gln Glu Ser Leu Gly Asp Leu Gly Pro Phe Asp
145 150 155 160

Gln Gly Ser Pro Phe Ala Gln Glu Leu Leu Asp Asp Gly Arg Gln Ala
165 170 175

Ser Pro Tyr Tyr Cys Ser Thr Tyr Gly Pro Gly Ala Pro Ser Pro Gly
180 185 190

Ser Ser Asp Val Ser Thr Ala Arg Thr Ala Thr Pro Gln Ser Ser His
195 200 205

Ala Ser Asp Ser Gly Gly Ser Asp Val Asp Leu Asp Leu Thr Glu Ser
210 215 220

Lys Val Phe Pro Arg Asp Asp Phe Thr Asp Tyr Lys Lys Gly Glu Pro
225 230 235 240

Lys His Gly Lys Arg Lys Arg Gly Arg Pro Arg Lys Leu Ser Lys Glu
245 250 255

Tyr Trp Asp Cys Leu Glu Gly Lys Lys Ser Lys His Ala Pro Arg Gly
260 265 270

Thr His Leu Trp Glu Phe Ile Arg Asp Ile Leu Ile His Pro Glu Leu
275 280 285

Asn Glu Gly Leu Met Lys Trp Glu Asn Arg His Glu Gly Val Phe Lys
290 295 300

Phe Leu Arg Ser Glu Ala Val Ala Gln Leu Trp Gly Gln Lys Lys Lys
305 310 315 320

Asn Ser Asn Met Thr Tyr Glu Lys Leu Ser Arg Ala Met Arg Tyr Tyr
325 330 335

Tyr Lys Arg Glu Ile Leu Glu Arg Val Asp Gly Arg Arg Leu Val Tyr
340 345 350

Lys Phe Gly Lys Asn Ser Ser Gly Trp Lys Glu Glu Glu Val Gly Glu
355 360 365

Ser Arg Asn
370



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..40

(D) OTHER INFORMATION: /note= "human ESX A-region/Pointed
domain (amino acids 64-103 of SEQ ID NO:2)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Pro Gln Phe Trp Ser Lys Thr Gln Val Leu Asp Trp Ile Ser Tyr Gln
1 5 10 15

Val Glu Lys Asn Lys Tyr Asp Ala Ser Ala Ile Asp Phe Ser Arg Cys
20 25 30

Asp Met Asp Gly Ala Thr Leu Cys
35 40

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..38

(D) OTHER INFORMATION: /note= "human ETS-1
A-region/Pointed domain (amino acids 69-106)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Pro Arg Gln Trp Thr Glu Thr His Val Arg Asp Trp Val Met Trp Ala
1 5 10 15

Val Asn Glu Phe Ser Leu Lys Gly Val Asp Phe Gln Lys Phe Cys Met
20 25 30

Asn Gly Ala Ala Leu Cys
35

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..51

(D) OTHER INFORMATION: /note= "human ESX serine-rich box
(amino acids 188-238 of SEQ ID NO:2)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Ala Pro Ser Pro Gly Ser Ser Asp Val Ser Thr Ala Gly Thr Gly Ala
1 5 10 15

Ser Arg Ser Ser His Ser Ser Asp Ser Gly Gly Ser Asp Val Asp Leu 
20 25 30 

Asp Pro Thr Asp Gly Lys Leu Phe Pro Ser Asp Gly Phe Arg Asp Cys 
35 40 45 

Lys Lys Gly 
50 

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 51 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..51

(D) OTHER INFORMATION: /note= "SOX4 serine box (amino
acids 370-420)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

Ala Pro Ser Ser Ala Pro Ser His Ala Ser Ser Ser Ala Ser Ser His 
1 5 10 15

Ser Ser Ser Ser Ser Ser Ser Gly Ser Ser Ser Ser Asp Asp Glu Phe 
20 25 30 

Glu Asp Asp Leu Leu Asp Leu Asn Pro Ser Ser Asn Phe Glu Ser Met 
35 40 45

Ser Leu Gly 
50 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..16

(D) OTHER INFORMATION: /note= "portion of human ESX serine
box showing clustering of serine residues opposite a hydrophobic
face in a helical wheel model"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Ser Pro Gly Ser Ser Asp Val Ser Thr Ala Gly Thr Gly Ala Ser Arg 
1 5 10 15

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..81

(D) OTHER INFORMATION: /note= "human ESX Ets DNA binding
domain (amino acids 274-354 of SEQ ID NO:2)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

His Leu Trp Glu Phe Ile Arg Asp Ile Leu Ile His Pro Glu Leu Asn
1 5 10 15

Glu Gly Leu Met Lys Trp Glu Asn Arg His Glu Gly Val Phe Lys Phe
20 25 30

Leu Arg Ser Glu Ala Val Ala Gln Leu Trp Gly Gln Lys Lys Lys Asn
35 40 45

Ser Asn Met Thr Tyr Glu Lys Leu Ser Arg Ala Met Arg Tyr Tyr Tyr
50 55 60

Lys Arg Glu Ile Leu Glu Arg Val Asp Gly Arg Arg Leu Val Tyr Lys
65 70 75 80 

Phe 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..81

(D) OTHER INFORMATION: /note= "Elf-1 Ets DNA binding
domain (amino acids 209-289)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Tyr Leu Trp Glu Phe Leu Leu Ala Leu Leu Gln Asp Lys Ala Thr Cys
1 5 10 15

Pro Lys Tyr Ile Lys Trp Thr Gln Arg Glu Lys Gly Ile Phe Lys Leu
20 25 30

Val Asp Ser Lys Ala Val Ser Arg Leu Trp Gly Lys His Lys Asn Lys
35 40 45

Pro Asp Met Asn Tyr Glu Thr Met Gly Arg Ala Leu Arg Tyr Tyr Tyr
50 55 60

Gln Arg Gly Ile Leu Ala Lys Val Glu Gly Gln Arg Leu Val Tyr Gln
65 70 75 80 

Phe 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

Leu Trp Gln Phe Leu Leu
1 5 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Lys Leu Ser Arg 
1 



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Leu Arg Tyr Tyr Tyr 
1 5 



(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Leu Trp Glu Phe 
1 



(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

Arg Tyr Tyr Tyr 
1 



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Arg Leu Val Tyr 
1 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 397 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

TCAGCCCTGG CCAGGCCCCC AGGAAGAATT TCCAGGGCCA GAGGGCAGCC TAAGGCACAG 60 

ATGCCCACCC CTGCAATGTT CCCGCCACAT GCCCAGTTCA GTACCCAGGG CCCAACCCCA 120

GAGGGTGCGG AATGACAGAT TCTGACAATC ATTAAACCAG CCAGGCCTGA TTTCCCAGCA 180

CCGCCCGTTA GGATATGGGC CAAGTGGCAC GGAATATGCA AATCACATGG GACAGGGAGC 240

CCAGTCTGAA GGCCAGGAAA TCCCCAGCAT CCAATGAGCC ACCAGCTCAG GTTACAACCG 300

GGGACGTACG CCGAAGACCT GGAGGGGAGG AGCTCCTGCT TTGCTCTATT TAGAGCGGGT 360

GGGGGCAGCG CCCTGGCCAC ACTCATCACT GCTACCT 397

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 400 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

CCATCTCTGG CCTGGCCCCT GGGAGGAATT TCCTGGGCCA GAGGGCAGCC GAAAGCACAG 60 

ATGCCCACCC CAGCAACGTT CCCGCCACCT GCCCAGGCCA GTGCCCCGTG CCCAACCCCA 120

GAGGGTGCGG GATGACAGAC TCTGACAATC ATTAAACCAG CCGGGCCTGA TTTCCCAGCA 180

CTGCCTGCTA AGATCCGGGC CAAGTGGCAC TGAATATGCA AATCACATGG GGCCAGGAGC 240

CCAGTCTAAA GGCCAGGAAA TCCCCTCCAT CCAATGAGAC ACCAGCTCAG GTTACTGCAG 300

GGGACACACT ATAAAGCCCT GAGCTCAGGG AGGAGCTCCC TCCAGGCTCT ATTTAGAGCC 360

GGGTAGGGGA GCGCAGCGGC CAGATACCTC AGCGCTACCT 400

(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: -

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /note= "TA5 oligonucleotide
containing Ets responsive element from HER2/neu promoter"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 
GGAGGAGGGC TGCTTGAGGA AGTATAAGAA T 31 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /note= "m1 mutant TA5 sequence"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 
GGAGGAGGTA TGCTTGAGGA AGTATAAGAA T 31 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /note= "m2 mutant TA5 sequence" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GGAGGAGGGC TGCTTGCGGA AGTATAAGAA T 31 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /note= "m3 mutant TA5 sequence"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GGAGGAGGGC TGCTTGAGAG AGTATAAGAA T 31

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /note= "m4 mutant TA5 sequence"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
GGAGGAGGGC TGCTTGACCA AGTATAAGAA T 31



(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: -

(B) LOCATION: 1..31

(D) OTHER INFORMATION: /note= "m5 mutant TA5 sequence"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GGAGGAGGGC TGCTTGAGGA AGCATAAGAA T 31



(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ATACTTCCTC AAGCA 15






